Beyond Intra-modality: A Survey of Heterogeneous Person Re-identification
An efficient and effective person re-identification (ReID) system relieves
users from tedious manual video review and accelerates the process of
video analysis. Recently, with the explosive demands of practical applications,
a lot of research efforts have been dedicated to heterogeneous person
re-identification (Hetero-ReID). In this paper, we provide a comprehensive
review of state-of-the-art Hetero-ReID methods that address the challenge of
inter-modality discrepancies. According to the application scenario, we
classify the methods into four categories -- low-resolution, infrared, sketch,
and text. We begin with an introduction to ReID, and make a comparison between
Homogeneous ReID (Homo-ReID) and Hetero-ReID tasks. Then, we describe and
compare existing datasets for performing evaluations, and survey the models
that have been widely employed in Hetero-ReID. We also summarize and compare
the representative approaches from two perspectives, i.e., the application
scenario and the learning pipeline. We conclude with a discussion of some future
research directions. Follow-up updates are available at:
https://github.com/lightChaserX/Awesome-Hetero-reID
Comment: Accepted by IJCAI 2020. Project url:
https://github.com/lightChaserX/Awesome-Hetero-reID
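At retrieval time, Hetero-ReID methods typically embed query and gallery images from different modalities into a shared space and rank by similarity. The sketch below illustrates that final matching step only; the embeddings, encoder names, and toy values are illustrative assumptions, not from any surveyed method.

```python
import numpy as np

def rank_gallery(query_emb, gallery_embs):
    """Rank gallery identities by cosine similarity to a query embedding.

    In Hetero-ReID, query_emb would come from one modality (e.g. an
    infrared or sketch encoder) and gallery_embs from another (RGB),
    after both encoders have been trained into a shared embedding space.
    """
    q = query_emb / np.linalg.norm(query_emb)
    g = gallery_embs / np.linalg.norm(gallery_embs, axis=1, keepdims=True)
    sims = g @ q                      # cosine similarity per gallery item
    return np.argsort(-sims)          # gallery indices, best match first

# Toy example: three gallery embeddings; the query is closest to index 2.
gallery = np.array([[1.0, 0.0], [0.0, 1.0], [0.7, 0.7]])
query = np.array([0.6, 0.8])
print(rank_gallery(query, gallery))   # [2 1 0]
```

Cross-modal training objectives differ across the four categories the survey covers, but this cosine-ranking step is common to all of them.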
A Survey of Dataset Refinement for Problems in Computer Vision Datasets
Large-scale datasets have played a crucial role in the advancement of
computer vision. However, they often suffer from problems such as class
imbalance, noisy labels, dataset bias, or high resource costs, which can
inhibit model performance and reduce trustworthiness. With the rise of
data-centric research, various data-centric solutions have been proposed to
solve the dataset problems mentioned above. They improve the quality of
datasets by re-organizing them, which we call dataset refinement. In this
survey, we provide a comprehensive and structured overview of recent advances
in dataset refinement for problematic computer vision datasets. Firstly, we
summarize and analyze the various problems encountered in large-scale computer
vision datasets. Then, we classify the dataset refinement algorithms into three
categories based on the refinement process: data sampling, data subset
selection, and active learning. In addition, we organize these dataset
refinement methods according to the addressed data problems and provide a
systematic comparative description. We point out that these three types of
dataset refinement have distinct advantages and disadvantages for dataset
problems, which informs the choice of the data-centric method appropriate to a
particular research objective. Finally, we summarize the current literature and
propose potential future research topics.
Comment: 33 pages, 10 figures, to be published in ACM Computing Surveys
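The simplest member of the data-sampling family mentioned above is class-balanced re-sampling for imbalanced datasets. The following is a minimal sketch under assumed toy labels, not an algorithm from the survey: minority classes are oversampled with replacement, majority classes undersampled.

```python
import random
from collections import defaultdict

def class_balanced_sample(labels, per_class, seed=0):
    """Re-sample a dataset so every class contributes `per_class` examples.

    A minimal instance of 'data sampling' as a dataset-refinement step:
    the dataset is re-organized rather than the model changed.
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, y in enumerate(labels):
        by_class[y].append(idx)
    sampled = []
    for y, idxs in sorted(by_class.items()):
        if len(idxs) >= per_class:
            sampled.extend(rng.sample(idxs, per_class))     # undersample
        else:
            sampled.extend(rng.choices(idxs, k=per_class))  # oversample
    return sampled

# Imbalanced toy labels: six examples of class 0, two of class 1.
labels = [0, 0, 0, 0, 0, 0, 1, 1]
idx = class_balanced_sample(labels, per_class=4)
counts = {y: sum(labels[i] == y for i in idx) for y in (0, 1)}
print(counts)   # each class now contributes 4 examples
```

Data subset selection and active learning generalize this idea: instead of balancing by label alone, they score examples by difficulty, redundancy, or expected label value.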
MetaGCD: Learning to Continually Learn in Generalized Category Discovery
In this paper, we consider a real-world scenario where a model that is
trained on pre-defined classes continually encounters unlabeled data that
contains both known and novel classes. The goal is to continually discover
novel classes while maintaining the performance in known classes. We name the
setting Continual Generalized Category Discovery (C-GCD). Existing methods for
novel class discovery cannot directly handle the C-GCD setting due to some
unrealistic assumptions, such as the unlabeled data only containing novel
classes. Furthermore, they fail to discover novel classes in a continual
fashion. In this work, we lift all these assumptions and propose an approach,
called MetaGCD, to learn how to incrementally discover novel classes with less
forgetting.
Our proposed method uses a meta-learning framework and leverages the offline
labeled data to simulate the testing incremental learning process. A
meta-objective is defined to revolve around two conflicting learning objectives
to achieve novel class discovery without forgetting. Furthermore, a soft
neighborhood-based contrastive network is proposed to discriminate uncorrelated
images while attracting correlated images. We build strong baselines and
conduct extensive experiments on three widely used benchmarks to demonstrate
the superiority of our method.
Comment: This paper has been accepted by ICCV202
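The soft neighborhood idea can be made concrete with a small sketch. Rather than a hard positive/negative split, each sample weights every other sample in the batch by a softmax over cosine similarities, so correlated images attract each other. This is a generic illustration of soft-neighbor weighting, not MetaGCD's exact loss; the temperature value is an assumption.

```python
import numpy as np

def soft_neighborhood_weights(embs, temperature=0.1):
    """Soft neighbor assignment over a batch of embeddings.

    Each row of the result is a softmax over cosine similarities to the
    other samples: likely-correlated images receive large weights
    (attraction), uncorrelated ones near-zero weights.
    """
    z = embs / np.linalg.norm(embs, axis=1, keepdims=True)
    sims = z @ z.T / temperature
    np.fill_diagonal(sims, -np.inf)          # exclude self-pairs
    exp = np.exp(sims - sims.max(axis=1, keepdims=True))
    return exp / exp.sum(axis=1, keepdims=True)

# Two near-duplicate images and one unrelated image.
batch = np.array([[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]])
w = soft_neighborhood_weights(batch)
print(np.round(w, 2))   # rows sum to 1; samples 0 and 1 weight each other most
```

A contrastive loss built on such weights needs no class labels, which is what lets novel classes be discovered from unlabeled data.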
Intermediate intraseasonal variability in the western tropical Pacific Ocean: meridional distribution of equatorial Rossby waves influenced by a tilted boundary
Author Posting. © American Meteorological Society, 2020. This article is posted here by permission of American Meteorological Society for personal use, not for redistribution. The definitive version was published in Journal of Physical Oceanography 50(4) (2020): 921-933, doi:10.1175/JPO-D-19-0184.1.

Intermediate-depth intraseasonal variability (ISV) at a 20–90-day period, as detected in velocity measurements from seven subsurface moorings in the tropical western Pacific, is interpreted in terms of equatorial Rossby waves. The moorings were deployed between 0° and 7.5°N along 142°E from September 2014 to October 2015. The strongest ISV energy at 1200 m occurs at 4.5°N. Peak energy at 4.5°N is also seen in an eddy-resolving global circulation model. An analysis of the model output identifies the source of the ISV as short equatorial Rossby waves with westward phase speed but southeastward and downward group velocity. Additionally, it is shown that a superposition of the first three baroclinic modes is required to represent the ISV energy propagation. Further analysis using a 1.5-layer shallow water model suggests that the first meridional mode Rossby wave accounts for the specific meridional distribution of ISV in the western Pacific. The same model suggests that the tilted coastlines of Irian Jaya and Papua New Guinea, which lie to the south of the moorings, shift the location of the northern peak of meridional velocity oscillation from 3°N to near 4.5°N. The tilt of this boundary with respect to a purely zonal alignment therefore needs to be taken into account to explain this meridional shift of the peak. Calculation of the barotropic conversion rate indicates that the intraseasonal kinetic energy below 1000 m can be transferred into the mean flows, suggesting a possible forcing mechanism for intermediate-depth zonal jets.

This study is supported by the National Natural Science Foundation of China (Grants 91958204 and 41776022), the China Ocean Mineral Resources Research and Development Association Program (DY135-E2-3-02), and the Strategic Priority Research Program of the Chinese Academy of Sciences (Grant XDA22000000). L. Pratt was supported by the U.S. National Science Foundation Grant OCE-1657870. F. Wang thanks the support from the Scientific and Technological Innovation Project by Qingdao National Laboratory for Marine Science and Technology (Grant 2016ASKJ12), the National Program on Global Change and Air-Sea Interaction (Grant GASI-IPOVAI-01-01), and the National Natural Science Foundation of China (Grants 41730534, 41421005, and U1406401).
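The wave behavior described above follows from the standard equatorial beta-plane dispersion relation for Rossby waves, a textbook result rather than anything specific to this study. For meridional mode $m$, zonal wavenumber $k$, and baroclinic gravity-wave speed $c_n$:

```latex
\omega = \frac{-\beta k}{\,k^{2} + (2m+1)\,\beta / c_n\,}
```

Here $\beta$ is the meridional gradient of the Coriolis parameter. Long waves (small $k$) have westward phase and group velocity, while short waves (large $k$) keep a westward phase speed but acquire an eastward component of group velocity, consistent with the short-wave energy pathway identified in the model analysis; the mode dependence through $c_n$ is why several baroclinic modes must be superposed to capture the observed energy propagation.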
Multi-level feature fusion network combining attention mechanisms for polyp segmentation
Clinically, automated polyp segmentation techniques have the potential to
significantly improve the efficiency and accuracy of medical diagnosis, thereby
reducing the risk of colorectal cancer in patients. Unfortunately, existing
methods suffer from two significant weaknesses that can impact the accuracy of
segmentation. Firstly, features extracted by encoders are not adequately
filtered and utilized. Secondly, semantic conflicts and information redundancy
caused by feature fusion are not attended to. To overcome these limitations, we
propose a novel approach for polyp segmentation, named MLFF-Net, which
leverages multi-level feature fusion and attention mechanisms. Specifically,
MLFF-Net comprises three modules: Multi-scale Attention Module (MAM),
High-level Feature Enhancement Module (HFEM), and Global Attention Module
(GAM). Among these, MAM is used to extract multi-scale information and polyp
details from the shallow output of the encoder. In HFEM, the deep features of
the encoders complement each other by aggregation. Meanwhile, the attention
mechanism redistributes the weight of the aggregated features, weakening the
conflicting redundant parts and highlighting the information useful to the
task. GAM combines encoder and decoder features and computes global
dependencies to prevent receptive field locality. Experimental
results on five public datasets show that the proposed method can not only
segment multiple types of polyps but also has advantages over current
state-of-the-art methods in both accuracy and generalization ability.
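The attention-based re-weighting of aggregated features can be sketched in a few lines. The following is a simplified stand-in for that idea, not the paper's architecture: two feature maps are aggregated, then per-channel sigmoid gates (from global average pooling) suppress redundant channels and emphasize useful ones. Shapes and values are illustrative assumptions.

```python
import numpy as np

def attention_fuse(feat_a, feat_b):
    """Fuse two (C, H, W) feature maps with channel-attention gates.

    Aggregated features are re-weighted per channel so that
    redundant or conflicting channels are suppressed and
    task-relevant ones are emphasized.
    """
    agg = feat_a + feat_b                    # aggregate the two feature maps
    pooled = agg.mean(axis=(1, 2))           # global average pool -> (C,)
    attn = 1.0 / (1.0 + np.exp(-pooled))     # sigmoid channel gates in (0, 1)
    return agg * attn[:, None, None]         # re-weight each channel

a = np.ones((4, 8, 8))
b = np.zeros((4, 8, 8))
fused = attention_fuse(a, b)
print(fused.shape)   # (4, 8, 8)
```

In a real network the gates would be produced by learned layers rather than a bare sigmoid over pooled activations, but the fuse-pool-gate-reweight pattern is the same.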